Shared genetics and causal association between plasma levels of SARS‐CoV‐2 entry receptor ACE2 and Alzheimer's disease

Abstract
Background: Patients with Alzheimer's disease (AD) have the highest risk of COVID-19 infection, hospitalization, and mortality. However, the link between AD and COVID-19 outcomes remains largely unclear. ACE2 is an entry receptor for SARS-CoV-2. Circulating ACE2 is a novel biomarker of death and is associated with COVID-19 outcomes.
Methods: We explored the shared genetics and causal association between AD and plasma ACE2 levels using large-scale genome-wide association study, gene expression, expression quantitative trait loci, and high-throughput plasma proteomic profiling datasets.
Results: We found a significant causal effect of genetically increased circulating ACE2 on increased risk of AD. Cross-trait association analysis identified 19 shared genetic variants, and three variants at chromosome 6p21.32 (rs3104412, rs2395166, and rs3135344) were associated with COVID-19 infection, hospitalization, and severity. We mapped the 19 variants to 117 genes, which were significantly upregulated in lung, spleen, and small intestine, downregulated in brain tissues, and involved in immune system, immune disease, and infectious disease pathways. The plasma proteins corresponding to LST1, AGER, TNXB, and APOC1 were predominantly associated with COVID-19 infection, ventilation, and death.
Conclusion: Together, our findings suggest shared genetics and a causal association between AD and plasma ACE2 levels, which may partially explain the link between AD and COVID-19.


| INTRODUCTION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused coronavirus disease 2019 (COVID-19), a devastating global pandemic. Evidence indicates that several medical comorbidities, such as chronic obstructive pulmonary disease, asthma, cardiovascular disease (CVD), diabetes, hypertension, and dementias, increase the risk of COVID-19 infection, hospitalization, and mortality. 1,2 Importantly, the older population with dementia, especially Alzheimer's disease (AD), is facing an unprecedented threat from COVID-19 and has the highest risk of COVID-19 infection, hospitalization, and mortality. 1,2 Meanwhile, COVID-19 further increases the risk of AD. 3,4

Angiotensin-converting enzyme 2 (ACE2) is a protein on the surface of many cell types. 5 It cuts up the larger protein angiotensinogen into small proteins that then go on to regulate functions in these cells. 5,7,8 It is believed that circulating ACE2 is generated from cell-membrane expressed ACE2, shed by ADAM-17 and other proteases. 8 SARS-CoV-2 utilizes the catalytic site of full-length membrane-bound ACE2 for host cell entry, which is followed by viral internalization together with ACE2 and by ACE2 degradation, accelerating the conversion from membrane-bound ACE2 to circulating ACE2. 5,9 Therefore, upon SARS-CoV-2 infection, full-length membrane-bound ACE2 levels are markedly reduced and circulating ACE2 levels are markedly increased, as widely reported in COVID-19 patients by observational studies. 10,11 Importantly, high circulating ACE2 is associated with increased COVID-19 severity and mortality, and could be used to predict severity and mortality. 10,11

In addition to COVID-19, circulating ACE2 is also a novel biomarker of death and CVD. 12,13 In patients with CVD, increased circulating ACE2 levels are associated with adverse cardiovascular outcomes. 12 In the general population, high circulating ACE2 levels are associated with increased risk of total deaths, incident heart failure, myocardial infarction, stroke, and diabetes, independent of age, sex, ancestry, and traditional cardiovascular risk factors. 13 Compared with the well-established clinical risk factors (smoking, diabetes, blood pressure, lipids, and body-mass index), circulating ACE2 is the highest ranked predictor of death, and is also a strong predictor of CVD, including heart failure, stroke, and myocardial infarction. 13 Importantly, recent findings further support a positive genetic association of circulating ACE2 with severe COVID-19, CVD, asthma, diabetes, and hypertension, as well as a causal effect of circulating ACE2 on COVID-19 infection, hospitalization, and severity. 14 Collectively, these findings show that circulating ACE2 shares a genetic basis with COVID-19 and its established risk factors, and could link COVID-19 severity and mortality with these risk factors.

Evidence shows that AD pathology might aggravate the consequences of COVID-19 infection. 15 However, the genetic association between circulating ACE2 levels and the risk of AD currently remains unclear. We consider that there may be a shared genetic etiology between circulating ACE2 levels and AD, which may help to explain the highest risk of COVID-19 infection, hospitalization, and mortality in individuals with a preexisting diagnosis of AD, as well as the increased risk of AD in COVID-19 patients.

Here, we explore the shared genetic etiology between AD and plasma ACE2 levels. In stage 1, we examine the causal association between circulating ACE2 and AD using Mendelian randomization (MR). In stage 2, we identify the shared genetic variants using a cross-trait association analysis. In stage 3, we map the shared genetic variants to their corresponding genes, and conduct tissue-specific gene expression analysis, tissue-specific enrichment analysis, and gene set enrichment analysis. In stage 4, we investigate the association of the shared genetic variants and their corresponding genes with COVID-19 outcomes. Figure 1 provides the schematic diagram of the study design.

FIGURE 1 The schematic diagram of the study design. Recoverable panel labels: Enrichment analysis (tissue-specific enrichment, gene set enrichment); Shared genetic variation (P AD < 0.05).

… (n = 63,926), stage 2 (n = 18,845), and stage 3A (n = 11,666) or stage 3B (n = 30,511 from stage 2 + stage 3A). 16 AD is diagnosed using the same diagnostic criteria, including DSM-III-R, DSM-IV, and NINCDS-ADRDA, across the three stages. 16 There is no clear evidence of cognitive or overall difference across different stages. 16 In IGAP stage 1, a total of 9,456,058 common variants and 2,024,574 rare variants were imputed and selected for analysis. 16 In IGAP stage 2, a total of 11,632 variants were further genotyped in 8362 AD cases and 10,483 controls, and were meta-analyzed with IGAP stage 1. 16 Here, we selected IGAP stage 1 for MR analysis, as it included the full set of genetic variants (n = 9,456,058), and IGAP stage 1 + stage 2 for cross-trait association analysis, as it included the largest sample size (n = 82,771).

| Circulating ACE2 GWAS dataset
To understand the genetic basis of ACE2 protein levels, Yang et al. 14 performed the largest GWAS meta-analysis of plasma ACE2 levels, measured by the Olink platform in 28,204 individuals from 14 cohorts in the SCALLOP consortium (Systematic and Combined Analysis of Olink Proteins). Here, we selected the circulating ACE2 GWAS dataset for both LDSC analysis and cross-trait association analysis. Yang et al. 14 identified only 10 independent genome-wide significant genetic variants, including nine in the autosomes and one in the X chromosome, which together explain 4.1% of the phenotypic variance of plasma ACE2, equivalent to about 30% of the heritability.
To obtain more autosomal genetic variants as potential instrumental variables in MR analysis, we performed a clumping analysis of the plasma ACE2 GWAS dataset to select independent autosomal genetic variants with p < 1.00E-05 using TwoSampleMR v0.5.7 and two key parameters: a clumping window of 250 kb and a clumping r² cutoff of 0.01.
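The clumping step can be illustrated with a small greedy procedure. This is only a sketch under stated assumptions: it is not the TwoSampleMR implementation (which relies on a reference panel for linkage disequilibrium), the `Variant` type, the `clump` function, and the `r2` lookup are hypothetical names, and the toy variants and LD values are made up.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    rsid: str
    chrom: str
    pos: int    # base-pair position
    p: float    # association p value


def clump(variants, r2, p_max=1e-5, window_bp=250_000, r2_max=0.01):
    """Greedy clumping: walk variants from most to least significant and
    keep one only if no already-kept variant within the window is in LD."""
    candidates = sorted((v for v in variants if v.p < p_max), key=lambda v: v.p)
    kept = []
    for v in candidates:
        in_ld = any(
            k.chrom == v.chrom
            and abs(k.pos - v.pos) <= window_bp
            and r2(k.rsid, v.rsid) > r2_max
            for k in kept
        )
        if not in_ld:
            kept.append(v)
    return kept


# Toy data: rsIDs, positions, p values, and LD values are invented.
toy = [
    Variant("rsA", "1", 100_000, 1e-9),
    Variant("rsB", "1", 150_000, 1e-7),  # near rsA and in LD with it
    Variant("rsC", "1", 900_000, 1e-6),  # outside the 250 kb window
    Variant("rsD", "2", 120_000, 1e-3),  # fails the p value threshold
]
toy_ld = {frozenset(("rsA", "rsB")): 0.8}


def toy_r2(a, b):
    """Hypothetical LD lookup; unlisted pairs are treated as uncorrelated."""
    return toy_ld.get(frozenset((a, b)), 0.0)


index_variants = [v.rsid for v in clump(toy, toy_r2)]
```

In this toy run, rsB is dropped because it sits within 250 kb of the more significant rsA and exceeds the r² cutoff, while rsC survives because it lies outside the window.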

| Cross-trait meta-analysis
We conducted a cross-trait meta-analysis to identify the genetic variants shared by AD and circulating ACE2 using METAL, which is a popular tool for meta-analysis of GWAS datasets. 20 METAL provides two analysis schemes. In the first scheme, METAL combines the p values across different studies by fixed-effects sample size weighted meta-analysis, taking into account the direction of effect. 20 In the second scheme, METAL combines effect size estimates and standard errors across different studies by fixed-effects inverse-variance weighted meta-analysis. 20 Here, we selected both analysis schemes to identify the shared genetic variants reaching genome-wide significance (p < 5.00E-08) in the meta-analysis and suggestive trait-specific significance (p < 0.05) for both AD and circulating ACE2.
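For a single variant, the two schemes can be written out as follows. This is a minimal sketch of the standard fixed-effects formulas rather than METAL's own code; the function names are hypothetical and the example p values and sample sizes are made up.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_sided_p(z):
    """Two-sided p value of a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sample_size_weighted(p_values, directions, sample_sizes):
    """Scheme 1: convert p values to signed Z-scores, weight by sqrt(N)."""
    z_scores = []
    for p, direction in zip(p_values, directions):
        magnitude = -_STD_NORMAL.inv_cdf(p / 2.0)  # non-negative for p <= 1
        z_scores.append(direction * magnitude)
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / math.sqrt(
        sum(w * w for w in weights)
    )
    return z, two_sided_p(z)


def inverse_variance_weighted(betas, standard_errors):
    """Scheme 2: weight effect sizes by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se, two_sided_p(beta / se)


# Invented inputs: two studies with p = 0.01 each.
z_same, p_same = sample_size_weighted([0.01, 0.01], [+1, +1], [80_000, 20_000])
z_opp, p_opp = sample_size_weighted([0.01, 0.01], [+1, -1], [80_000, 20_000])
beta, se, p_ivw = inverse_variance_weighted([0.1, 0.1], [0.05, 0.05])
```

Concordant effect directions reinforce each other and give a smaller combined p value, whereas opposite directions partly cancel, which is why the direction of effect matters in the first scheme.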

| Gene mapping
We aim to identify the risk genes corresponding to the shared genetic variants using both positional mapping and expression quantitative trait loci (eQTLs) mapping. For positional mapping, we map the shared genetic variants to the nearest genes using HaploReg v4.1. 21 For eQTLs mapping, we identify risk genes whose expression might be regulated by the shared genetic variants using multiple publicly available eQTLs datasets from human whole blood, brain tissues, microglial cells, and other human tissues. Here, we selected 49 eQTLs datasets in 49 human tissues from the Genotype-Tissue Expression Project (GTEx version 8), 22 1 large-scale eQTLs meta-analysis dataset in 1433 brain cortex samples, 23 4 eQTLs datasets in 255 primary human microglial samples isolated at autopsy from four different brain regions of 100 individuals with neurodegenerative, neurological, or neuropsychiatric disorders, as well as unaffected controls, 24 and 4 eQTLs datasets in whole blood, including 31,684 individuals, 25 2765 individuals, 26 2116 individuals, 27 and 5257 individuals. 28 Statistically significant association is defined as p < 1.00E-04.

| Tissue-specific gene expression analysis
Using all genes from both positional mapping and eQTLs mapping, we performed a tissue-specific gene expression analysis with FUMA v1.5.0, which is an online web application to annotate and prioritize genetic associations. 29 FUMA evaluates gene expression and tests tissue-specific enrichment using expression data from the 54 tissue types in GTEx v8. 29 The gene expression value, TPM (Transcripts Per Million), is an averaged expression value per tissue type per gene following winsorization at 50 and log2 transformation with pseudocount 1. 29 This kind of averaged expression allows for comparison across tissues and genes.
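The expression summary described above can be sketched in a few lines. This assumes the order cap at 50, then log2 with pseudocount 1, then average over the samples of a tissue; the function name and the toy TPM values are hypothetical and are not taken from FUMA.

```python
import math
from statistics import mean


def average_log2_expression(tpm_values, cap=50.0, pseudocount=1.0):
    """Winsorize each TPM value at `cap`, apply log2(x + pseudocount),
    and average across the samples of one tissue for one gene."""
    return mean(math.log2(min(x, cap) + pseudocount) for x in tpm_values)


# Invented TPM values for one gene in one tissue.
moderate = average_log2_expression([3.0, 0.0])     # (log2(4) + log2(1)) / 2
extreme = average_log2_expression([1000.0, 75.0])  # both samples capped at 50
```

Capping limits the influence of a few very highly expressed samples, so the averaged values stay on a comparable scale across genes and tissues.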

| Tissue-specific enrichment analysis
Using all genes from both positional mapping and eQTLs mapping, we performed a tissue-specific enrichment analysis with FUMA v1.5.0. 29 Tissue-specific enrichment is tested using the differentially expressed genes (DEGs) defined for each tissue type of each expression dataset. 29 First, gene expression values were normalized (zero-mean) following a log2 transformation of the expression value (TPM). 29 Second, DEGs were calculated by performing a two-sided t-test for each tissue type against all others. Only those genes with Bonferroni corrected p value ≤ 0.05 and absolute log fold change ≥ 0.58 were defined as DEGs. 29 Third, tissue-specific enrichment analysis is performed to test whether the mapped genes are overrepresented among the DEGs of any tissue type using the hypergeometric test. Tissue types with Bonferroni corrected p value < 0.05 are defined as showing significant enrichment of DEGs. 29
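The hypergeometric test and the Bonferroni correction used in this step can be sketched as below. The function names and the toy counts are hypothetical; only the number of tissue types (54) comes from the text, and FUMA's own implementation may differ in detail.

```python
from math import comb


def hypergeom_upper_tail(overlap, background, tissue_degs, input_genes):
    """P(X >= overlap) when `input_genes` genes are drawn without
    replacement from `background` genes, `tissue_degs` of which are DEGs."""
    total = comb(background, input_genes)
    upper = min(tissue_degs, input_genes)
    favourable = sum(
        comb(tissue_degs, i) * comb(background - tissue_degs, input_genes - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_tests)


# Invented counts: 20 background genes, 5 DEGs in the tissue,
# 4 input genes, 3 of which are DEGs.
p_raw = hypergeom_upper_tail(overlap=3, background=20, tissue_degs=5, input_genes=4)
p_adj = bonferroni(p_raw, n_tests=54)  # 54 GTEx v8 tissue types
```

With these toy counts the raw p value falls below 0.05 but does not survive correction for 54 tests, which shows why the corrected threshold matters when many tissues are tested.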

| Gene set enrichment analysis
We performed a gene set enrichment analysis of all genes from both positional mapping and eQTLs mapping using WebGestalt (WEB-based GEne SeT AnaLysis Toolkit), a functional enrichment analysis web tool. 30 Here, we focused on the KEGG pathways in the WebGestalt functional database. 30 The hypergeometric test was used to detect any overrepresentation of the shared genes among all the genes in a given KEGG pathway. 30 KEGG pathways with Bonferroni corrected p value < 0.05 are defined as significantly enriched pathways.

| Association between genetic variants and COVID-19 outcomes
We investigated the potential association between the shared genetic variants and COVID-19 outcomes using large-scale GWAS datasets from the COVID-19 Human Genetics Initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic. 31 We downloaded the GWAS summary statistics from COVID19-hg GWAS meta-analyses round 7 for four COVID-19 outcomes, including infection: cases versus population (159,840 cases and 2,782,977 controls), hospitalization: hospitalized cases versus population (44,986

| Association between shared genes and COVID-19 outcomes
We explored the shared genes using the COVID-19 Proteomics Data and Analytics Browser, which includes 1449 proteins associated with any of the three outcomes (841 for infection, 833 for ventilation, and 253 for death). 32

| Mendelian randomization analysis
We identified 70 independent autosomal genetic variants with p < 1.00E-05 by clumping analysis of the plasma ACE2 GWAS dataset using TwoSampleMR v0.5.7 and two key parameters, a clumping window of 250 kb and a clumping r² cutoff of 0.01, as provided in Table S1. Here, we selected these 70 genetic variants as the potential instrumental variables, and extracted their corresponding AD GWAS summary statistics from IGAP stage 1. Using IVW, we identified a significant causal effect of genetically increased circulating ACE2 level on increased risk of AD (OR = 1.12, 95% CI: 1.05-
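For readers unfamiliar with IVW, the estimate behind an odds ratio of this kind can be sketched as a weighted average of per-variant Wald ratios. This is the textbook fixed-effect form with first-order weights, which is an assumption here and may differ from the exact model fitted by TwoSampleMR; the function names and the toy summary statistics are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each variant contributes a Wald ratio beta_outcome / beta_exposure,
    weighted by beta_exposure^2 / se_outcome^2.
    """
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Invented summary statistics for two instruments.
beta_mr, se_mr = ivw_estimate(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.01, 0.02],
    se_outcome=[0.01, 0.01],
)
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta_mr, se_mr)
```

Instruments with larger effects on the exposure and more precise outcome estimates receive more weight, so the second toy variant dominates the combined estimate.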

| Cross-trait meta-analysis
Using fixed-effects sample size weighted meta-analysis, we found 19 genetic variants that were associated with both AD and circulating ACE2 at genome-wide significance (p < 5.00E-08) for the cross-trait meta-analysis and suggestive trait-specific significance (p < 0.05) for AD and circulating ACE2, with the same directions of effect sizes (Table 1); 4, 1, 1, and 13 genetic variants are located at chromosome 6p21.32, 8p21.2-p21.1, 17p13.2, and 19q13.32, respectively. These genetic variants are in linkage disequilibrium with each other.
Using fixed-effects inverse-variance weighted meta-analysis, these 19 genetic variants were further verified, and three genetic variants, rs9269853, rs2395166, and rs10415074, reached genome-wide significance (p < 5.00E-08), as provided in Table 1.
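The selection rule for a shared variant combines three conditions: genome-wide significance in the meta-analysis, suggestive significance in each trait, and a concordant effect direction. A minimal sketch follows; the function name and the example numbers are hypothetical.

```python
def is_shared_variant(meta_p, p_ad, p_ace2, beta_ad, beta_ace2,
                      genome_wide=5e-8, suggestive=0.05):
    """Apply the three filters used to call a variant shared between traits."""
    same_direction = beta_ad * beta_ace2 > 0
    return (
        meta_p < genome_wide
        and p_ad < suggestive
        and p_ace2 < suggestive
        and same_direction
    )


# Invented examples.
passes = is_shared_variant(1e-9, 0.001, 0.002, beta_ad=0.05, beta_ace2=0.03)
opposite = is_shared_variant(1e-9, 0.001, 0.002, beta_ad=0.05, beta_ace2=-0.03)
one_trait_only = is_shared_variant(1e-9, 1e-10, 0.40, beta_ad=0.05, beta_ace2=0.03)
```

The trait-specific filter guards against variants whose meta-analysis signal is driven by only one of the two traits.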

| Gene mapping using eQTLs analysis
eQTLs analysis not only confirms those findings from positional mapping, but also highlights some novel findings. Four genetic variants at chromosome 6p21.32

| Tissue-specific gene expression analysis
We obtained a total of 117 unique genes using positional mapping and/or eQTLs mapping. A total of 102 genes had a recognized Ensembl ID, and 94 genes had an Ensembl ID recognized in FUMA. Tissue-specific gene expression results are provided in Figure 2 as a gene expression heat map, which is clustered by both genes and tissues. The results showed that some genes were highly expressed across GTEx v8 54 tissue types, such as HLA-DRA, HLA-DRB1, CLU, KIF1C, BAG6, CLPTM1, MINK1, and ATF6B. Meanwhile, some genes were highly expressed specifically in some tissues, such as LST1 and AGER. LST1 only showed high expression levels in whole blood, spleen, and lung. AGER only showed high expression levels in lung and thyroid.

TABLE 1 Shared genetic variants from cross-trait meta-analysis of AD and circulating ACE2 with p < 5.00E-08 and single trait p < 0.05.

FIGURE 2 Heat map of tissue-specific gene expression of genes corresponding to shared genetic variants from cross-trait meta-analysis of AD and circulating ACE2. The heat map is plotted using FUMA v1.5.0 and gene expression data from GTEx v8 54 tissue types. The heat map was ordered by both gene and tissue clustering. Darker red represents higher expression of a gene, and darker blue lower expression, across genes and tissues.

| Tissue-specific enrichment analysis
Using GTEx v8 54 tissue types, DEGs are significantly enriched in lung, spleen, and small intestine with Bonferroni corrected p value < 0.05, which are highlighted in red as provided in Figure 3.
Interestingly, subgroup analysis of the upregulated and downregulated DEGs indicated that these genes were significantly upregulated in lung, spleen, and small intestine, and downregulated in brain tissues. We have provided more detailed results from the tissue specificity test in Table S9.

| Gene set enrichment analysis
We identified 25 significantly enriched pathways, as provided in Figure 4 and Table S10.

FIGURE 3 Tissue-specific gene expression enrichment analysis of differentially expressed genes across GTEx v8 54 tissue types. Enrichment of differentially expressed genes was identified using FUMA v1.5.0 and gene expression data from GTEx v8 54 tissue types. Tissue types with Bonferroni corrected p value ≤ 0.05 are defined as showing significant enrichment of differentially expressed genes, and are highlighted in red.

| Association between shared genetic variants and COVID-19 outcomes
Using the GWAS summary statistics from COVID19-hg GWAS meta-analyses … effect alleles from rs3104412, rs2395166, and rs3135344 are consistent across AD, circulating ACE2, and COVID-19 outcomes. Here, we provided all association results in Table S11.

Note: The position is based on GRCh37/hg19. We define suggestive association as p value < 0.05, and statistical significance as Bonferroni corrected p value < 0.05/12 = 4.17E-03, given three genetic variants and four COVID-19 outcomes.
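The Bonferroni thresholds quoted in the table notes of this paper follow directly from the number of tests. A short check, with a hypothetical helper name:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# 3 genetic variants x 4 COVID-19 outcomes = 12 tests.
variant_threshold = bonferroni_threshold(0.05, 3 * 4)
# 117 genes x 3 COVID-19 outcomes = 351 tests.
gene_threshold = bonferroni_threshold(0.05, 117 * 3)

variant_text = f"{variant_threshold:.2E}"
gene_text = f"{gene_threshold:.2E}"
```

Formatted to three significant figures, the two thresholds read 4.17E-03 and 1.42E-04, matching the values stated in the notes.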
Until now, growing evidence has shown the involvement of ACE2 and related genes in the serum or plasma in AD and other diseases related to aging. AD cases had decreased ACE2 activity in the serum compared with normal control individuals. 33 Singh et al. 34 found reduced levels of soluble ACE2 in plasma in stroke-operated mice compared to sham mice. Parkinson's disease (PD) patients had significantly higher serum levels of ACE2 autoantibodies than controls. 35

TABLE 3 Association between shared genes and COVID-19 outcomes.
Note: We define suggestive association as p value < 0.05, and statistical significance as Bonferroni corrected p value < 0.05/(117*3) = 1.42E-04, given 117 genes and three COVID-19 outcomes.
Here, we explored the shared genetic etiology between AD and plasma ACE2 levels by a comprehensive analysis. In stage 1, we conducted a causal association analysis. We found a significant causal effect of genetically increased circulating ACE2 level on increased risk of AD. Our current finding is consistent, in both magnitude and direction, with a recent study evaluating the genetic association between circulating ACE2 and other COVID-19 risk factors. 14 Yang et al. 14 … 36 However, they did not identify any significant genetic relation between circulating ACE2 and AD (rg = 0.0563, rg_SE = 0.0998, p = 0.573). Here, we used the largest AD GWAS in 94,437 individuals of European ancestry. 16 Therefore, the large-scale AD GWAS dataset may help to identify a more significant positive genetic relation.
In stage 2, we performed a cross-trait association analysis, and found 19 genetic variants that were significantly associated with both AD and circulating ACE2 at genome-wide significance (p < 5.00E-08) at chromosome 6p21.32, 8p21.2-p21.1, 17p13.2, and 19q13.32. In stage 3, we mapped these 19 genetic variants to 117 corresponding genes using positional mapping and eQTLs analysis.
Interestingly, growing evidence supports our current findings that these genes are associated with AD and/or COVID-19. At chromosome 6p21.32, HLA-DRA, HLA-DRB1, and HLA-DQA1 are also identified to be AD risk genes. 16 A gene prioritization approach highlights HLA-DRB1, HLA-DRA, HLA-DQA1, HLA-DPA1, and HLA-DRB5 to be the top candidate genes among 46 genes in the MHC locus. 16

… the NF-κB signaling pathway. 41 Gene-based tests of AD GWAS datasets have identified BCL3 to be an AD susceptibility gene. 42 Differential gene expression analysis revealed a downregulation of BCL3 in COVID-19 patients compared to controls in lung, liver, kidney, and heart tissues. 43,44

Tissue-specific gene expression analysis showed that some genes were highly expressed across GTEx v8 54 tissue types, and others were highly expressed in specific tissues, such as LST1 and AGER. Tissue-specific enrichment analysis suggested that these genes were significantly upregulated in lung, spleen, and small intestine, and downregulated in brain tissues. Gene set enrichment analysis highlighted significantly enriched pathways involved in immune system, immune diseases, and infectious diseases. Our findings are in line with the pathology observed in post-mortem tissues obtained from COVID-19 patients. COVID-19 causes multi-organ dysfunction; it predominantly affects the lung, and also harms other organs including spleen, small intestine, heart, gut, liver, kidneys, and brain. 45,46

In stage 4, we investigated the association of shared genetic variants and their corresponding genes with COVID-19 outcomes. We identified three genetic variants, rs3104412, rs2395166, and rs3135344, at chromosome 6p21.32 that were associated with COVID-19 infection, hospitalization, and severity. Importantly, these three genetic variants had the same directions of the effect alleles across AD, circulating ACE2, and COVID-19 outcomes. Meanwhile, we found that the plasma proteins corresponding to LST1, AGER, TNXB, and APOC1 were predominantly associated with COVID-19 infection, ventilation, and death.

Interestingly, recent findings support the involvement of LST1, AGER, TNXB, and APOC1 in COVID-19. A large-scale genome-wide analysis has identified LST1 to be a COVID-19 locus and a potential effector gene. 47 Single-cell RNA-Seq datasets in COVID-19 patients suggested that LST1 may play a role in the effect of angiotensin II receptor blockers on COVID-19-related mortality. 48 Therefore, LST1 not only may help to predict COVID-19 outcomes, but may also be a potential COVID-19 treatment target. AGER is also named RAGE, and its plasma protein level was identified to be significantly upregulated in ICU COVID-19 patients compared to controls. 49 A high level of soluble RAGE is associated with a greater risk of mortality in COVID-19 patients treated with dexamethasone, 50 and is considered to be a biomarker of COVID-19 disease severity and an indicator of the need for mechanical ventilation, acute respiratory distress syndrome, and mortality. 51

Our current study still has some limitations. First, it is important to check the results of ACE2 and related genes in the serum in three stages of AD (early stage, middle stage, and late stage) and in normal controls. However, there are no large-scale publicly available serum data from normal controls and from AD cases at the early, middle, and late stages. We will further evaluate ACE2 and related genes in the serum when relevant data become publicly available in the future.

KEGG pathway classification (Figure 4) shows that most of these pathways are associated with the immune system and immune diseases, such as autoimmune thyroid disease, intestinal immune network for IgA production, type I diabetes mellitus, graft-versus-host disease, allograft rejection, and asthma. Meanwhile, other pathways are associated with infectious diseases, including Staphylococcus aureus infection, leishmaniasis, herpes simplex infection, toxoplasmosis, Epstein-Barr virus infection, influenza A, tuberculosis, and human T-cell leukemia virus 1 infection. Here, we have provided more detailed results from gene set enrichment analysis in Table S10.

Cell specific peripheral immune responses indicate that HLA-DQA1, HLA-DRB5, and HLA-DPB1 are the most predictive of survival in CD16 monocytes from critical COVID-19 patients. 37

FIGURE 5 Abundance distributions of plasma proteins corresponding to LST1, AGER, TNXB, and APOC1 in different COVID-19 outcomes. Box plots were plotted using the COVID-19 Proteomics Data and Analytics Browser. (A) Abundance distributions of plasma proteins corresponding to LST1 in COVID-19 infection (all cases vs. healthy controls) with p value = 3.72E-48; (B) Abundance distributions of plasma proteins corresponding to LST1 in COVID-19 ventilation (cases requiring ventilation vs. cases without ventilation support) with p value = 2.82E-31; (C) Abundance distributions of plasma proteins corresponding to LST1 in COVID-19 death (died cases vs. survived cases) with p value = 2.40E-16; (D) Abundance distributions of plasma proteins corresponding to TNXB in COVID-19 infection (all cases vs. healthy controls) with p value = 1.23E-45; (E) Abundance distributions of plasma proteins corresponding to APOC1 in COVID-19 infection (all cases vs. healthy controls) with p value = 7.94E-29; (F) Abundance distributions of plasma proteins corresponding to AGER in COVID-19 death (died cases vs. survived cases) with p value = 5.75E-06.

… partially explain the link between AD and COVID-19. Our findings have potential clinical implications. On the one hand, AD patients with elevated plasma ACE2 levels may have increased risk of COVID-19 infection, hospitalization, and mortality, and assessment of plasma ACE2 levels may be a means of identifying AD patients at high risk for adverse COVID-19 outcomes. On the other hand, COVID-19 patients with elevated plasma ACE2 levels may have increased risk of AD, and assessment of plasma ACE2 levels may be a means of identifying COVID-19 patients at high risk of AD.

AUTHOR CONTRIBUTIONS
GYL, YZ, and YC conceived and initiated the project. GYL, YZ, TW, and ZFH analyzed the data, and wrote the first draft of the manuscript. All authors contributed to the interpretation of the results and critical revision of the manuscript for important intellectual content and approved the final version of the manuscript.

FIGURE 4 Gene set enrichment analysis of genes corresponding to shared genetic variants from cross-trait meta-analysis of AD and circulating ACE2. Gene set enrichment analysis was performed using WebGestalt (WEB-based GEne SeT AnaLysis Toolkit). KEGG pathways with Bonferroni corrected p value < 0.05 are defined as significantly enriched pathways.

At chromosome 8p21.2-p21.1, EPHX2 encodes soluble epoxide hydrolase (sEH), a key enzyme for epoxyeicosatrienoic acid (EET) signaling. 38,39 sEH inhibition or Ephx2 deletion delays AD progression and alleviates AD pathology in mouse models of AD. 38,39 Evidence from 50 COVID-19 patients and 94 age- and sex-matched controls shows that SARS-CoV-2 serum had signifi-